Session G-7

TCP and Congestion Control

Conference
8:30 AM — 10:00 AM EDT
Local
May 19 Fri, 5:30 AM — 7:00 AM PDT
Location
Babbio 221

i-NVMe: Isolated NVMe over TCP for a Containerized Environment

Seongho Lee, Ikjun Yeom and Younghoon Kim (Sungkyunkwan University, Korea (South))

Non-Volatile Memory Express (NVMe) over TCP is an efficient technology for accessing remote Solid State Drives (SSDs); however, it may cause a serious interference issue when used in a containerized environment. In this study, we propose a performance isolation scheme for NVMe over TCP in such an environment. The proposed scheme measures the CPU usage of the NVMe over TCP worker, charges it to containers in proportion to their NVMe traffic, and schedules containers to ensure isolated sharing of the CPU. However, because the worker runs with a higher priority than normal containers, it may not be possible to achieve performance isolation with container scheduling alone. To solve this problem, we also control the CPU usage of the worker by throttling NVMe over TCP traffic. The proposed scheme is implemented on a real testbed for evaluation. We perform extensive experiments with various workloads and demonstrate that the scheme can provide performance isolation even in the presence of excessive NVMe traffic.
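The charging step the abstract describes — measuring the worker's CPU time and splitting it across containers in proportion to their NVMe traffic — can be sketched as follows. This is a minimal illustration with hypothetical names, not the paper's implementation.

```python
def charge_worker_cpu(worker_cpu_ms, nvme_bytes_by_container):
    """Split the NVMe-over-TCP worker's measured CPU time (ms) across
    containers in proportion to each container's NVMe traffic (bytes).
    Hypothetical helper illustrating the proportional-charging idea."""
    total = sum(nvme_bytes_by_container.values())
    if total == 0:
        return {c: 0.0 for c in nvme_bytes_by_container}
    return {c: worker_cpu_ms * b / total
            for c, b in nvme_bytes_by_container.items()}
```

The charged amounts would then be fed into the container scheduler; the abstract's second mechanism (traffic throttling) kicks in when scheduling alone cannot bound the high-priority worker.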
Speaker Seongho Lee (Sungkyunkwan University, South Korea)

He is currently working toward the Ph.D. degree in computer science at Sungkyunkwan University, South Korea. His research interests include optimizing containerized environments and CPU scheduling.


Congestion Control Safety via Comparative Statics

Pratiksha Thaker (Carnegie Mellon University, USA); Tatsunori Hashimoto and Matei Zaharia (Stanford University, USA)

When congestion control algorithms compete on shared links, unfair outcomes can result, especially between algorithms that aim to prioritize different objectives. For example, a throughput-maximizing application could make the link completely unusable for a latency-sensitive application. In order to study these outcomes formally, we model the congestion control problem as a game in which agents have heterogeneous utility functions. We draw on the comparative statics literature in economics to derive simple and practically useful conditions under which all agents achieve at least ε utility at equilibrium, a minimal safety condition for the network to be useful for any application. Compared to prior analyses of similar games, we show that our framework supports a more realistic class of utility functions that includes highly latency-sensitive applications such as teleconferencing and online gaming.
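The ε-utility safety condition can be made concrete with a toy check. The utility forms below are illustrative (not from the paper): a throughput-maximizing agent values its rate directly, while a latency-sensitive agent gets no utility once its share of a unit-capacity link falls below a usability threshold.

```python
def epsilon_safe(utilities, operating_points, eps):
    """The abstract's minimal safety condition: every agent attains at
    least eps utility at the given equilibrium operating points."""
    return all(u(x) >= eps for u, x in zip(utilities, operating_points))

# Hypothetical utilities over each agent's achieved rate on a unit-capacity link:
tput = lambda rate: rate                          # throughput-maximizing app
latency = lambda rate: 1.0 if rate >= 0.3 else 0.0  # unusable below a threshold
```

Under a fair split both agents clear ε, while an aggressive throughput flow starves the latency-sensitive one below its threshold — the "link completely unusable" outcome the abstract describes.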
Speaker Pratiksha Thaker (Carnegie Mellon University)

Pratiksha Thaker is a postdoctoral researcher at Carnegie Mellon University. She is interested in applying tools from learning theory and game theory to practical systems problems.


Gemini: Divide-and-Conquer for Practical Learning-Based Internet Congestion Control

Wenzheng Yang and Yan Liu (Tencent, China); Chen Tian (Nanjing University, China); Junchen Jiang (University of Chicago, USA); Lingfeng Guo (The Chinese University of Hong Kong, Hong Kong)

Learning-based Internet congestion control algorithms have attracted much attention due to their potential performance improvement over traditional algorithms. However, such performance improvement usually comes at the expense of black-box design and high computational overhead, which prevent large-scale deployment over production networks. To address this problem, we propose a novel Internet congestion control algorithm called Gemini. It contains a parameterized congestion control module, which is white-box designed with low computational overhead, and an online parameter optimization module, which adapts the parameterized congestion control module to different networks for higher transmission performance. Extensive trace-driven emulations reveal that Gemini achieves a better balance between delay and throughput than state-of-the-art algorithms. Moreover, we have successfully deployed Gemini over production networks. The evaluation results show that the average throughput of Gemini is 5% higher than that of Cubic (4% higher than that of BBR) on a mobile application downloading service, and 61% higher than that of Cubic (33% higher than that of BBR) on a commercial network speed-test benchmarking service.
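A white-box congestion control module with exposed parameters can be sketched with a classic AIMD rule; this is a stand-in for the idea of a parameterized, tunable module, not Gemini's actual design. The outer optimization module would tune (alpha, beta) per network.

```python
def aimd_update(cwnd, loss, alpha, beta):
    """One parameterized congestion-window update: additive increase by
    alpha per RTT, multiplicative decrease by beta on loss. The (alpha,
    beta) pair is what an online optimizer could adapt per network."""
    return cwnd * beta if loss else cwnd + alpha
```

The appeal of this divide-and-conquer split is that the data path stays cheap and interpretable, while the learning happens off the critical path.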
Speaker Wenzheng Yang (Nanjing University and Tencent, China)



Marten: A Built-in Security DRL-Based Congestion Control Framework by Polishing the Expert

Zhiyuan Pan and Jianer Zhou (SUSTech, China); XinYi Qiu (Peng Cheng Laboratory, China); Weichao Li (Peng Cheng Laboratory, China); Heng Pan (Institute of Computing Technology, Chinese Academy of Sciences, China); Wei Zhang (The National Computer Network Emergency Response Technical Team Coordination Center of China, China)

Deep reinforcement learning (DRL) has proven to be an effective method for improving congestion control algorithms (CCAs). However, the lack of training data and limited training scale affect the effectiveness of DRL models. Combining rule-based CCAs (such as BBR) as a guide for DRL is an effective way to improve learning-based CCAs. Through experimental measurement, we find that rule-based CCAs limit action exploration and can even trigger wrong actions in pursuit of higher DRL reward. To overcome these constraints, we propose Marten, a framework that improves the effectiveness of rule-based CCAs as guides for DRL. Marten uses entropy as the degree of exploration and uses it to expand the exploration of DRL. Furthermore, Marten introduces a safety mechanism to avoid wrong DRL actions. We have implemented Marten in both the simulation platform OpenAI Gym and the deployment platform QUIC. Experimental results on a production network demonstrate that Marten can improve throughput by 11% and reduce latency by 8% on average compared with Eagle.
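The entropy measure Marten uses as a "degree of exploration" is, in the standard formulation, the Shannon entropy of the policy's action distribution: a near-deterministic policy (one pinned down by the rule-based guide) has low entropy, while a policy that still explores has high entropy. A minimal sketch:

```python
import math

def action_entropy(probs):
    """Shannon entropy of a policy's action distribution. Low entropy
    means the policy has collapsed onto few actions; an entropy bonus
    in the DRL objective pushes exploration back up."""
    return -sum(p * math.log(p) for p in probs if p > 0)
```

How the entropy term is weighted against the rule-based guide's reward, and how the safety mechanism vetoes actions, are Marten's specific contributions and are not shown here.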
Speaker Zhiyuan Pan (Southern University of Science and Technology)

Zhiyuan Pan is studying for a master's degree at Southern University of Science and Technology. His main research topics are network congestion control algorithms and deep reinforcement learning algorithms.


Session Chair

Ehab Al-Shaer

Session G-9

Cloud/Edge Computing 2

Conference
1:30 PM — 3:00 PM EDT
Local
May 19 Fri, 10:30 AM — 12:00 PM PDT
Location
Babbio 221

TanGo: A Cost Optimization Framework for Tenant Task Placement in Geo-distributed Clouds

Luyao Luo, Gongming Zhao and Hongli Xu (University of Science and Technology of China, China); Zhuolong Yu (Johns Hopkins University, USA); Liguang Xie (Futurewei Technologies, USA)

Cloud infrastructure has gradually become geographically distributed in order to provide anywhere, anytime connectivity to tenants all over the world. Tenant task placement in geo-distributed clouds involves three critical and coupled factors: regional diversity in electricity prices, access delay for tenants, and traffic demand among tasks. However, existing works disregard either the regional difference in electricity prices or the tenant requirements for tasks in geo-distributed clouds, resulting in increased operating costs or low user QoS.
To bridge the gap, we design a cost optimization framework for tenant task placement in geo-distributed clouds, called TanGo. It is non-trivial, however, to achieve such an optimization framework while meeting all tenant requirements. To this end, we first formulate the electricity-cost-minimizing task placement problem as a constrained mixed-integer non-linear programming problem. We then propose a near-optimal algorithm with a tight approximation ratio of (1-1/e) using an effective submodular-based method. Results of in-depth simulations based on real-world datasets show the effectiveness of our algorithm, as well as an overall 10%-30% reduction in electricity expenses compared to commonly adopted alternatives.
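The (1-1/e) ratio quoted above is the classic guarantee of greedy maximization of a monotone submodular function under a cardinality constraint. The sketch below shows that standard greedy technique on a toy coverage function; it is the general method the guarantee comes from, not TanGo's actual placement algorithm.

```python
def greedy_submodular(candidates, f, k):
    """Greedy maximization of a monotone submodular set function f under
    a cardinality constraint k: repeatedly add the element with the
    largest marginal gain. Achieves the classic (1 - 1/e) guarantee."""
    chosen = []
    for _ in range(k):
        best = max((c for c in candidates if c not in chosen),
                   key=lambda c: f(chosen + [c]) - f(chosen), default=None)
        if best is None:
            break
        chosen.append(best)
    return chosen

# Toy example: coverage functions are submodular. The value of a set of
# "sites" is the number of distinct regions it covers.
coverage = {"s1": {1, 2}, "s2": {2, 3}, "s3": {3}}
f = lambda S: len(set().union(*(coverage[s] for s in S)) if S else set())
```

Picking two sites greedily covers all three regions here; the guarantee says greedy never falls below (1-1/e) ≈ 63% of the optimum for this function class.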
Speaker Zhenguo Ma (University of Science and Technology of China)

Zhenguo Ma received the B.S. degree in software engineering from the Shandong University, China, in 2018. He is currently pursuing his Ph.D. degree in the School of Computer Science and Technology, University of Science and Technology of China. His research interests include cloud computing, edge computing and federated learning.


An Approximation for Job Scheduling on Cloud with Synchronization and Slowdown Constraints

Dejun Kong and Zhongrui Zhang (Shanghai Jiao Tong University, China); Yangguang Shi (Shandong University, China); Xiaofeng Gao (Shanghai Jiao Tong University, China)

Cloud computing has developed rapidly in recent years and serves many applications, for which job scheduling becomes ever more important to improving quality of service. Parallel processing on the cloud requires different machines to start the same job simultaneously and brings processing slowdown due to communication overhead, defined respectively as the synchronization constraint and parallel slowdown. This paper investigates a new job scheduling problem of makespan minimization on uniform machines and identical machines under the synchronization constraint and parallel slowdown. We first conduct a complexity analysis proving that the problem is difficult in the face of adversarial job allocation. Then we propose a novel job scheduling algorithm, United Wrapping Scheduling (UWS), and prove that UWS admits an O(log m)-approximation for makespan minimization over m uniform machines. For the special case of identical machines, UWS simplifies to the Sequential Allocation, Refilling and Immigration algorithm (SARI), which is proved to have a constant approximation ratio of 8 (tight up to a factor of 4). Performance evaluation shows that UWS and SARI achieve better makespans than the baseline methods United-LPT and FIFO, with a realistic approximation ratio of 2 relative to lower bounds.
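For context on the baselines: the name United-LPT suggests it builds on Longest-Processing-Time-first, the classic makespan heuristic on identical machines. The sketch below is plain LPT (not the paper's baseline or UWS/SARI), shown to make "makespan minimization" concrete.

```python
def lpt_makespan(jobs, m):
    """Longest-Processing-Time-first on m identical machines: assign each
    job, longest first, to the currently least-loaded machine; return the
    resulting makespan (maximum machine load)."""
    loads = [0] * m
    for p in sorted(jobs, reverse=True):
        i = loads.index(min(loads))  # least-loaded machine
        loads[i] += p
    return max(loads)
```

On jobs [3, 3, 2, 2, 2] with 2 machines, LPT yields makespan 7 while the optimum is 6 ({3,3} vs {2,2,2}), illustrating why heuristics are compared against lower bounds as in the abstract's evaluation.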
Speaker Dejun Kong (Shanghai Jiao Tong University)

Dejun Kong is a Ph.D. candidate at Shanghai Jiao Tong University. His research areas include scheduling algorithms, distributed computing, and data analytics.


Time and Cost-Efficient Cloud Data Transmission based on Serverless Computing Compression

Rong Gu and Xiaofei Chen (Nanjing University, China); Haipeng Dai (Nanjing University & State Key Laboratory for Novel Software Technology, China); Shulin Wang (Nanjing University, China); Zhaokang Wang and Yaofeng Tu (Nanjing University of Aeronautics and Astronautics, China); Yihua Huang (Nanjing University, China); Guihai Chen (Shanghai Jiao Tong University, China)

Nowadays, there is substantial demand for cross-region data transmission on the cloud. Serverless computing is a promising technique for compressing data before transmission. However, it is challenging to estimate the data transmission time and monetary cost with serverless compression. In addition, minimizing the data transmission cost is non-trivial due to the enormous parameter space. This paper focuses on this problem and makes the following contributions: (1) We propose an empirical model of data transmission time and monetary cost under serverless compression. (2) For single-task cloud data transmission, we propose two efficient parameter search methods, based on Sequential Quadratic Programming and on Eliminate then Divide and Conquer, with error upper bounds. (3) Furthermore, a parameter search method based on dynamic programming and numerical computation is proposed to reduce the algorithm complexity from exponential to linear for concurrent multi-task scenarios. We implement the system and evaluate it with various workloads and application cases on the real-world AWS cloud. Experimental results show that the proposed approach improves parameter search efficiency by more than 3× compared with state-of-the-art methods and achieves better parameter quality. Compared with other competing transmission approaches, our approach achieves higher time efficiency and lower monetary cost.
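The time/cost trade-off being optimized can be illustrated with a deliberately simplified model: compress first, then ship the smaller payload. All parameter names and the model's form are illustrative, not the paper's empirical model.

```python
def transfer_cost(size_gb, ratio, compress_s_per_gb, bw_gbps, price_per_gb):
    """Toy serverless-compression transfer model. Returns
    (total_time_s, dollar_cost): compression time grows with the raw
    size, while transfer time and egress cost shrink with the
    compression ratio -- the tension the parameter search navigates."""
    compressed = size_gb / ratio
    time_s = size_gb * compress_s_per_gb + compressed / bw_gbps
    cost = compressed * price_per_gb
    return time_s, cost
```

A stronger compressor (higher ratio, higher seconds-per-GB) cuts cost but can lengthen total time, which is why the paper needs principled search methods (SQP, divide-and-conquer) rather than a closed-form optimum.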
Speaker Rong Gu (Nanjing University)

Rong Gu is an assistant professor in the Department of Computer Science and Technology at Nanjing University. His research interests include cloud and big data computing systems, efficient cache/index systems, and edge systems. He has published over 40 papers in USENIX ATC, ICDE, WWW, INFOCOM, VLDBJ, IEEE TPDS, TNET, TMC, IPDPS, ICPP, IWQoS, and DASFAA, and has published a monograph. He received the IEEE TCSC Award for Excellence in Scalable Computing (Early Career), the IEEE HPCC 2022 Best Paper Award (first author), the first prize of the Jiangsu Science and Technology Prize in 2018, the Tencent Cloud Valuable Professional (TVP) Award in 2021, and first place in the CloudSort track of the 30th Sort Benchmark competition (record holder). His research results have been adopted by well-known open-source projects such as Apache Spark and Alluxio, and by leading IT/domain companies including Alibaba, Baidu, Tencent, ByteDance, Huatai Securities, Intel, Sinopec, and Weibo. He is the community chair of the Fluid open-source project (a CNCF Sandbox project) and a founding PMC member and maintainer of the Alluxio (formerly Tachyon) open-source project. He is also co-program chair of the 15th IEEE iThings, co-chair of the 23rd ChinaSys, and a TPC member of SOSP'21/OSDI'22/USENIX ATC'22 Artifacts, AAAI'20, and IEEE IPDPS'22.


Enabling Age-Aware Big Data Analytics in Serverless Edge Clouds

Zichuan Xu, Yuexin Fu and Qiufen Xia (Dalian University of Technology, China); Hao Li (China Coal Research Institute, China)

In this paper, we aim to fill the gap between serverless computing and mobile edge computing by enabling query evaluations for big data analytics in short-lived functions of a serverless edge cloud (SEC). Specifically, we formulate novel age-aware big data query evaluation problems in a SEC so that the age of data is minimized, where the age of data is defined as the time difference between the finish time of analyzing a dataset and the generation time of the dataset. We propose approximation algorithms for the age-aware big data query evaluation problem with a single query, based on a novel parameterized virtualization technique that strives for a fine trade-off between short-lived functions and the large resource demands of big data queries. We also devise an online learning algorithm with a bounded regret for the problem with multiple queries arriving dynamically and without prior knowledge of the queries' resource demands. We finally evaluate the performance of the proposed algorithms by extensive simulations. Simulation results show that the performance of our algorithms is promising.
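The objective being minimized follows directly from the abstract's definition and can be written out as a one-liner, which makes the metric unambiguous:

```python
def age_of_data(generation_time, analysis_finish_time):
    """Age of data as defined in the abstract: elapsed time from a
    dataset's generation to the completion of its analysis. The SEC
    scheduling algorithms aim to minimize this quantity."""
    return analysis_finish_time - generation_time
```

Minimizing age thus rewards both fast analysis and analyzing fresh data soon after it is produced.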
Speaker Yuexin Fu

Yuexin Fu is a Master candidate at Dalian University of Technology. His research interests include edge computing and serverless computing.


Session Chair

Li Chen

Session G-10

Miscellaneous

Conference
3:30 PM — 5:00 PM EDT
Local
May 19 Fri, 12:30 PM — 2:00 PM PDT
Location
Babbio 221

CLP: A Community based Label Propagation Framework for Multiple Source Detection

Chong Zhang and Luoyi Fu (Shanghai Jiao Tong University, China); Fei Long (Chinaso, China); Xinbing Wang (Shanghai Jiao Tong University, China); Chenghu Zhou (Shanghai Jiao Tong University, China)

Given the aftermath of information spreading, i.e., an infected network after the propagation of malicious rumors, malware, or viruses, how can we identify the sources of the cascade? Answering this question, known as the multiple source detection (MSD) problem, is critical both for forensic use and for insights to prevent future epidemics.
Despite considerable recent effort, most approaches are built on a preset propagation model, which limits their application range. Some attempts aim to break this limitation via a label propagation scheme in which nodes surrounded by large proportions of infected nodes are highlighted. Nonetheless, detection accuracy may suffer because the node labels are simply integers, with all infected or all uninfected nodes sharing the same initialization respectively, which falls short of sufficiently distinguishing their structural properties. To this end, we propose a community-based label propagation (CLP) framework that locates multiple sources by exploiting the community structures formed by the infected subgraphs of different sources. Besides, CLP enhances detection accuracy by incorporating node prominence and exoneration effects. As such, CLP is applicable to more propagation models. Experiments on both synthetic and real-world networks further validate the superiority of CLP over the state-of-the-art.
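A single round of a generic label propagation scheme looks like the sketch below: each node's score is blended with the mean score of its neighbors. This is the baseline mechanism the abstract critiques; CLP's actual update additionally weighs community structure, node prominence, and exoneration effects.

```python
def propagate(labels, adj, alpha=0.5):
    """One generic label-propagation round: blend each node's score with
    the mean score of its neighbors. labels maps node -> float score,
    adj maps node -> list of neighbors; alpha controls the blend."""
    return {v: (1 - alpha) * labels[v]
               + alpha * (sum(labels[u] for u in adj[v]) / len(adj[v])
                          if adj[v] else 0.0)
            for v in labels}
```

Iterating this to convergence and highlighting high-scoring nodes yields source candidates; the abstract's point is that uniform integer initializations leave this scheme too coarse, motivating CLP's community-aware refinements.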
Speaker Chong Zhang

Chong Zhang received his B.E. degree in Telecommunications Engineering from Xidian University, China, in 2018. He is currently pursuing the Ph.D. degree in the Department of Electronic Engineering at Shanghai Jiao Tong University, Shanghai, China. His research interests are in the areas of social networks and data mining.



GinApp: An Inductive Graph Learning based Framework for Mobile Application Usage Prediction

Zhihao Shen, Xi Zhao and Jianhua Zou (Xi'an Jiaotong University, China)

Mobile application usage prediction aims to infer the applications (Apps) that a user will launch next. It is critical for many applications, e.g., system optimization and smartphone resource management. Recently, graph-based App prediction approaches have proven effective, but they still suffer from several issues. First, these approaches cannot naturally generalize to unseen Apps. Second, they do not model asymmetric transitions between Apps. Third, they struggle to differentiate the contributions of different App usage contexts to the prediction result. In this paper, we propose GinApp, an inductive graph representation learning based framework, to resolve these issues. Specifically, we first construct an attribute-aware heterogeneous directed graph from App usage records, in which App-to-App transitions and their occurrence counts are modeled by directed weighted edges. Then, we develop an inductive graph learning method that generates effective node representations for unseen Apps by sampling and aggregating information from neighboring nodes. Finally, we reformulate App usage prediction as a link prediction problem on the graph, outputting the Apps with the largest probabilities as predictions. Extensive experiments on two large-scale App usage datasets reveal that GinApp provides state-of-the-art performance for App usage prediction.
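Once node representations exist, the final link-prediction step typically reduces to scoring (context, App) pairs by embedding similarity and ranking. The inner-product decoder below is a generic choice, not necessarily GinApp's exact one.

```python
def link_score(z_context, z_app):
    """Score a (usage-context, App) link as the inner product of their
    learned embeddings; the Apps with the largest scores become the
    predictions. A generic link-prediction decoder."""
    return sum(a * b for a, b in zip(z_context, z_app))
```

The inductive part of the framework matters precisely here: because unseen Apps get embeddings by aggregating their neighbors' features, they can be scored by the same decoder without retraining.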
Speaker Zhihao Shen (Xi'an Jiaotong University)

Zhihao Shen received his B.E. degree in automation engineering from School of Electronic and Information, Xi'an Jiaotong University, Xi'an, China, in 2016, where he is currently pursuing the Ph.D. degree with the Systems Engineering Institute. His research interests include mobile computing, big data analytics, and deep learning.


Cost-Effective Live Expansion of Three-Stage Switching Networks without Blocking or Connection Rearrangement

Takeru Inoue and Toru Mano (NTT Network Innovation Labs., Japan); Takeaki Uno (National Institute of Informatics, Japan)

The rapid growth of datacenter traffic requires network expansion without interrupting communications within and between datacenters. Past studies on network expansion while carrying live traffic have focused on packet-switched networks such as Ethernet. Optical networks have recently been attracting attention for datacenters due to their great transmission capacity and power efficiency, but they are modeled as circuit-switched networks. To our knowledge, no practical live expansion method is known for circuit-switched networks; the Clos network, a nonblocking three-stage switching structure, can be expanded, but doing so involves connection rearrangement (interruption) or significant initial investment. This paper proposes a cost-effective network expansion method that does not rely on connection rearrangement. Our method expands a network by rewiring inactive links to include additional switches, which also avoids blocking new connection requests. Our method is designed to require only a number of switches proportional to the network size, which suppresses the initial investment. Numerical experiments show that our method incrementally expands a circuit-switched network from 1,000 to 30,000 ports, sufficient to accommodate all racks in today's huge datacenters. The initial structure of our method consists of only three 1024x1024 switches, while that of a reference method requires 34 switches.
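For background on the structure the abstract builds on: Clos's classic result states that a three-stage network with n inputs per ingress switch is strictly nonblocking if and only if it has at least 2n-1 middle-stage switches. The paper's contribution concerns expanding such networks live; the static condition itself is just:

```python
def clos_strictly_nonblocking(n, m):
    """Clos's condition: a three-stage network with n inputs per ingress
    switch and m middle-stage switches is strictly nonblocking
    (no connection request can ever be blocked) iff m >= 2*n - 1."""
    return m >= 2 * n - 1
```

Meeting this bound from day one for the final network size is the "significant initial investment" the abstract mentions; the proposed method instead adds middle-stage capacity incrementally by rewiring inactive links.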
Speaker Takeru Inoue (NTT Labs, Japan)

Takeru Inoue is a Distinguished Researcher at Nippon Telegraph and Telephone Corporation (NTT) Laboratories, Japan. He received the B.E. and M.E. degrees in engineering science and the Ph.D. degree in information science from Kyoto University, Japan, in 1998, 2000, and 2006, respectively. In 2000, he joined NTT Laboratories. From 2011 to 2013, he was an ERATO Researcher with the Japan Science and Technology Agency, where his research focused on algorithms and data structures. Currently, his research interests widely cover the reliable design of communication networks. Inoue was the recipient of several prestigious awards, including the Best Paper Award of the Asia-Pacific Conference on Communications in 2005, the Best Paper Award of the IEEE International Conference on Communications in 2016, the Best Paper Award of IEEE Global Communications Conference in 2017, the Best Paper Award of IEEE Reliability Society Japan Joint Chapter in 2020, the IEEE Asia/Pacific Board Outstanding Paper Award in 2020, and the IEICE Paper of the Year in 2021. He serves as an Associate Editor of the IEEE Transactions on Network and Service Management.


ASR: Efficient and Adaptive Stochastic Resonance for Weak Signal Detection

Xingyu Chen, Jia Liu, Xu Zhang and Lijun Chen (Nanjing University, China)

Stochastic resonance (SR) provides a new way to detect weak signals by boosting undetectable signals with added white noise. However, existing work must spend a long time searching for optimal SR parameter settings, which does not fit some practical applications well. In this paper, we propose an adaptive SR scheme (ASR) that can amplify the original signal at low time cost. The scheme builds on our finding that the potential parameters are a key factor determining the performance of SR. By treating the system as a feedback loop, we can dynamically adjust the potential parameters according to the output signals and make SR happen adaptively. ASR answers two technical questions: how to evaluate the output signal, and how to tune the potential parameters quickly toward the optimum. In ASR, we first design a spectral-analysis-based solution that uses the continuous wavelet transform to examine whether SR happens. After that, we reduce parameter tuning to a constrained non-linear optimization problem and use sequential quadratic programming to iteratively optimize the potential parameters. We implement ASR and apply it to respiration-rate detection and machinery fault diagnosis. Extensive experiments show that ASR outperforms the state-of-the-art.
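The "potential parameters" in question are, in the standard bistable SR model, the coefficients (a, b) of the double-well potential U(x) = -a·x²/2 + b·x⁴/4. One Euler step of the resulting overdamped dynamics is sketched below; this is the textbook SR system, not ASR's full feedback loop.

```python
def sr_step(x, u, a, b, dt=0.01):
    """One Euler step of the overdamped bistable SR system
    dx/dt = a*x - b*x**3 + u(t), where u is the weak input signal plus
    noise. (a, b) are the potential parameters that ASR tunes online by
    scoring the output spectrum and re-optimizing via SQP."""
    return x + dt * (a * x - b * x ** 3 + u)
```

With a = b = 1 the wells sit at x = ±1; a well-chosen (a, b) lets the noise kick the state between wells in sync with the weak signal, which is the resonance ASR detects spectrally.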
Speaker Xingyu Chen (Nanjing University)

Xingyu Chen is currently a Ph.D. student in the Department of Computer Science and Technology at Nanjing University, China. His research interests focus on indoor localization and RFID systems. He is a student member of the IEEE.


Session Chair

Zhangyu Guan

